Development and experimentation with synthetic visible speech

Authors

MICHAEL M. COHEN

DOMINIC W. MASSARO

Abstract

Speech perception and speech production are skills that rival other impressive human achievements. Even after decades of intense effort, speech recognition by machine remains far inferior to human performance. Also, as we have all experienced, speech synthesis has not yet attained the naturalness and quality of the speech of even a young child. Our research has led us to the view that speech perception is easy for human beings because multiple sources of information support the identification and interpretation of the language input. Therefore, it is important to define the sources of information and to determine how they are evaluated and integrated to achieve recognition. The paradigm that we have developed permits us to determine which of the many potentially functional cues are actually used by human observers. The systematic variation of properties of the speech signal, combined with quantitative tests of models based on different sources of information, enables the investigator to test the psychological validity of different cues. Thus, our research strategy not only addresses how different sources of information are evaluated and integrated but can also uncover which sources of information are actually used. We believe that the research paradigm confronts both the important psychophysical question of the nature of information and the process question of how the information is transformed and mapped into behavior. Valuable and effective information is afforded by a view of the speaker's face in speech perception and recognition by humans. Visible speech is particularly effective


Similar resources

Synthesis of visible speech

Given the importance of visible information in face-to-face communication, visible speech synthesis is being developed to control and manipulate visible speech. Experiments have shown that this visible speech is particularly important when the auditory speech is degraded, because of noise, bandwidth filtering, or hearing impairment (Massaro, 1987). The strong influence of visible speech is not ...


Training Baldi to be multilingual: A case study for an Arabic Badr

In this paper, we describe research to extend the capability of an existing talking head, Baldi, to be multilingual. We use a parsimonious client/server architecture to impose autonomy in the functioning of an auditory speech module and a visual speech synthesis module. This scheme enables the implementation and the joint application of text-to-speech synthesis and facial animation in many langua...


Reducing spectral mismatches in concatenative speech synthesis via systematic database enrichment

This paper presents work performed for the Time-Domain TTS system, which is being developed at the ILSP for the Greek language. It focuses on the enhancement of the synthetic speech quality, by reducing the spectral mismatches between concatenated segments. To that end, a study has been performed to determine the distance that can best predict when a spectral mismatch is audible. Experimentatio...


From Speech is Special to Computer Aided Language Learning

After describing the belief that speech is special, empirical and theoretical research is reviewed undermining the tenets of this belief. A new framework is presented as a theoretical framework for language learning. Central to this framework is the natural ease of multimodal perception, particularly the value of visible speech. The value of synthetic talking heads is described along with their...


Experimental design study of RB 255 photocatalytic degradation under visible light using synthetic Ag/TiO2 nanoparticles: Optimization of experimental conditions

In the present study, silver-doped TiO2 (Ag/TiO2) nanoparticles were prepared by various Ag doping (wt%) and a combination of sol-gel and ultrasound irradiation. Ag/TiO2 nanoparticles were characterized by energy-dispersive X-ray analysis (EDX), scanning electron microscopy (SEM) and X-ray diffraction (XRD) techniques. Based on the Taguchi method, photocatalytic...



Publication date: 1994
